Shilong Liu

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

May 27, 2026
Via arXiv

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

May 14, 2026
Via arXiv

Wan-R1: Verifiable-Reinforcement Learning for Video Reasoning

Mar 29, 2026
Via arXiv

UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents

Feb 05, 2026
Via arXiv

Avenir-Web: Human-Experience-Imitating Multimodal Web Agents with Mixture of Grounding Experts

Feb 02, 2026
Via arXiv

MiLDEdit: Reasoning-Based Multi-Layer Design Document Editing

Jan 08, 2026
Via arXiv

CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations

Dec 30, 2025
Via arXiv

Web World Models

Dec 29, 2025
Via arXiv

AMS-IO-Bench and AMS-IO-Agent: Benchmarking and Structured Reasoning for Analog and Mixed-Signal Integrated Circuit Input/Output Design

Dec 25, 2025
Via arXiv

SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features

Sep 19, 2025
Via arXiv